import besca as bc
import pkg_resources
Specify the datasets that are already annotated and should be used for training. If you use only one dataset, pass it as a single-element list.
# the path to the datasets
train_dataset_paths = [pkg_resources.resource_filename('besca', 'datasets/data')]
# the names of the h5ad files
train_datasets = ['Smillie2019_processed.h5ad']
Specify the dataset of interest that should be annotated.
test_dataset = 'Martin2019_processed.h5ad'
test_dataset_path = pkg_resources.resource_filename('besca', 'datasets/data')
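Before reading the data, it can help to confirm that each directory and file name combine into a file that actually exists. The following is a minimal sketch, not part of besca: `resolve_dataset_files` and `missing_files` are hypothetical helpers, and the sketch assumes that the path list and the file-name list are paired one-to-one.

```python
import os


def resolve_dataset_files(paths, names):
    """Join each directory with its file name (hypothetical helper, not a besca function).

    Assumption: paths[i] is the directory that contains names[i].
    """
    if len(paths) != len(names):
        raise ValueError("expected one directory per dataset file name")
    return [os.path.join(p, n) for p, n in zip(paths, names)]


def missing_files(full_paths):
    """Return the subset of paths that do not point to an existing file."""
    return [p for p in full_paths if not os.path.isfile(p)]


# Illustrative directory name only; in the notebook the directory comes from pkg_resources.
example = resolve_dataset_files(["some_dir"], ["Smillie2019_processed.h5ad"])
print(example, missing_files(example))
```

With the variables defined above, one would call `missing_files(resolve_dataset_files(train_dataset_paths, train_datasets))` and expect an empty list.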
Give your analysis a name.
analysis_name = 'auto_annot_Martin2019_with_Smillie2019_Type'
Specify the column name of the cell type annotation you want to train on.
celltype = 'Type'
Choose a method:
method = 'logistic_regression'
Specify the merge method if you use multiple training datasets. It must be either scanorama or naive.
merge = 'scanorama'
Decide whether to use the raw data or only the highly variable genes. Using raw data increases computation time and does not necessarily improve predictions.
use_raw = False
You can choose to only consider a subset of genes from a signature set.
genes_to_use = 'all'
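The constraints stated above (training datasets given as a list, merge restricted to scanorama or naive when several training datasets are used, `use_raw` as a boolean) can be checked before the potentially slow reading step. This is a small sketch under those stated constraints only; `check_settings` is a hypothetical helper and not a besca function.

```python
# Merge methods named in the text above.
ALLOWED_MERGE = ("scanorama", "naive")


def check_settings(train_datasets, merge, use_raw):
    """Return a list of human-readable problems; an empty list means the settings look fine.

    Hypothetical helper for illustration, not part of besca.
    """
    problems = []
    if not isinstance(train_datasets, list) or len(train_datasets) == 0:
        problems.append("train_datasets must be a non-empty list")
    elif len(train_datasets) > 1 and merge not in ALLOWED_MERGE:
        # The merge method only matters when several training datasets are combined.
        problems.append("merge must be 'scanorama' or 'naive', got %r" % (merge,))
    if not isinstance(use_raw, bool):
        problems.append("use_raw must be True or False")
    return problems


print(check_settings(["Smillie2019_processed.h5ad"], "scanorama", False))
```

An empty list is printed for the settings used in this notebook.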
adata_trains, adata_pred, adata_orig = bc.tl.auto_annot.read_data(
    train_paths=train_dataset_paths,
    train_datasets=train_datasets,
    test_path=test_dataset_path,
    test_dataset=test_dataset,
    use_raw=use_raw,
)
Transforming to str index.
Reading files
Transforming to str index.
adata_trains[0].obs
| | CELL | Cluster | Health | Location | Subject | celltype_highlevel | nGene | nUMI | original_name | percent_mito | n_counts | n_genes | batch | leiden | dblabel | celltype | cluster_celltype | Type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | N7.EpiA.AAGCAAGAGTCAAC-Epi | Cycling TA | Non-inflamed | Epi | N7 | Epi | 1507 | 7428 | N7.EpiA.AAGCAAGAGTCAAC | 0.057351 | 7428.0 | 1507 | N7 | 8 | proliferating transit amplifying cell | epithelial cell | 8: epithelial cell | Epithelial |
| 1 | N7.EpiA.ACGAGGGAGCTGAT-Epi | Enterocyte Progenitors | Non-inflamed | Epi | N7 | Epi | 828 | 2877 | N7.EpiA.ACGAGGGAGCTGAT | 0.009037 | 2877.0 | 828 | N7 | 0 | enterocyte progenitor | epithelial cell | 0: epithelial cell | Epithelial |
| 2 | N7.EpiA.ACGTTTACTGGTAC-Epi | Immature Enterocytes 2 | Non-inflamed | Epi | N7 | Epi | 2318 | 15332 | N7.EpiA.ACGTTTACTGGTAC | 0.133707 | 15332.0 | 2318 | N7 | 7 | immature enterocyte | enterocyte | 7: enterocyte | Epithelial |
| 3 | N7.EpiA.AGAGAATGGTCATG-Epi | Enterocyte Progenitors | Non-inflamed | Epi | N7 | Epi | 884 | 3498 | N7.EpiA.AGAGAATGGTCATG | 0.002001 | 3498.0 | 884 | N7 | 7 | enterocyte progenitor | enterocyte | 7: enterocyte | Epithelial |
| 4 | N7.EpiA.AGAGCGGAGTATGC-Epi | TA 1 | Non-inflamed | Epi | N7 | Epi | 858 | 3261 | N7.EpiA.AGAGCGGAGTATGC | 0.003067 | 3261.0 | 858 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 5 | N7.EpiA.AGATTAACGCCATA-Epi | Cycling TA | Non-inflamed | Epi | N7 | Epi | 2743 | 20160 | N7.EpiA.AGATTAACGCCATA | 0.097123 | 20160.0 | 2743 | N7 | 8 | proliferating transit amplifying cell | epithelial cell | 8: epithelial cell | Epithelial |
| 6 | N7.EpiA.AGGATGCTTACAGC-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 812 | 2764 | N7.EpiA.AGGATGCTTACAGC | 0.003618 | 2764.0 | 812 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 7 | N7.EpiA.AGGTACACAGACTC-Epi | Cycling TA | Non-inflamed | Epi | N7 | Epi | 3109 | 23926 | N7.EpiA.AGGTACACAGACTC | 0.062192 | 23926.0 | 3109 | N7 | 8 | proliferating transit amplifying cell | epithelial cell | 8: epithelial cell | Epithelial |
| 8 | N7.EpiA.AGTCTACTTCTCTA-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 1945 | 12338 | N7.EpiA.AGTCTACTTCTCTA | 0.083806 | 12338.0 | 1945 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 9 | N7.EpiA.ATAGTCCTTAACCG-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 2172 | 13869 | N7.EpiA.ATAGTCCTTAACCG | 0.128055 | 13869.0 | 2172 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 10 | N7.EpiA.ATATACGAAGTACC-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 1245 | 5765 | N7.EpiA.ATATACGAAGTACC | 0.052559 | 5765.0 | 1245 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 11 | N7.EpiA.ATTCCAACTTTGGG-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 1380 | 7240 | N7.EpiA.ATTCCAACTTTGGG | 0.072238 | 7240.0 | 1380 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 12 | N7.EpiA.ATTCGACTGCATAC-Epi | Cycling TA | Non-inflamed | Epi | N7 | Epi | 1043 | 3617 | N7.EpiA.ATTCGACTGCATAC | 0.004147 | 3617.0 | 1043 | N7 | 8 | proliferating transit amplifying cell | epithelial cell | 8: epithelial cell | Epithelial |
| 13 | N7.EpiA.ATTGAAACCTATGG-Epi | TA 1 | Non-inflamed | Epi | N7 | Epi | 959 | 4008 | N7.EpiA.ATTGAAACCTATGG | 0.003743 | 4008.0 | 959 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 14 | N7.EpiA.CAAGCCCTTGACAC-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 1082 | 4640 | N7.EpiA.CAAGCCCTTGACAC | 0.067672 | 4640.0 | 1082 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 15 | N7.EpiA.CACGATGAGCTAAC-Epi | Cycling TA | Non-inflamed | Epi | N7 | Epi | 1082 | 3790 | N7.EpiA.CACGATGAGCTAAC | 0.002111 | 3790.0 | 1082 | N7 | 8 | proliferating transit amplifying cell | epithelial cell | 8: epithelial cell | Epithelial |
| 16 | N7.EpiA.CAGACATGTTTGTC-Epi | Cycling TA | Non-inflamed | Epi | N7 | Epi | 849 | 2844 | N7.EpiA.CAGACATGTTTGTC | 0.001055 | 2844.0 | 849 | N7 | 0 | proliferating transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 17 | N7.EpiA.CAGGAACTCGTTAG-Epi | Secretory TA | Non-inflamed | Epi | N7 | Epi | 1799 | 13535 | N7.EpiA.CAGGAACTCGTTAG | 0.103288 | 13535.0 | 1799 | N7 | 6 | transit amplifying cell | goblet cell | 6: goblet cell | Epithelial |
| 18 | N7.EpiA.CCACGGGAGTTACG-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 1097 | 4700 | N7.EpiA.CCACGGGAGTTACG | 0.027234 | 4700.0 | 1097 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 19 | N7.EpiA.CGAGGAGACGAGTT-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 1007 | 3832 | N7.EpiA.CGAGGAGACGAGTT | 0.071764 | 3832.0 | 1007 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 20 | N7.EpiA.CGCTAAGATAGCGT-Epi | Immature Goblet | Non-inflamed | Epi | N7 | Epi | 935 | 4147 | N7.EpiA.CGCTAAGATAGCGT | 0.005305 | 4147.0 | 935 | N7 | 6 | immature goblet cell | goblet cell | 6: goblet cell | Epithelial |
| 21 | N7.EpiA.CGGACTCTTGCCTC-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 1720 | 9256 | N7.EpiA.CGGACTCTTGCCTC | 0.085566 | 9256.0 | 1720 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 22 | N7.EpiA.CGTCCATGGGTTCA-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 804 | 2824 | N7.EpiA.CGTCCATGGGTTCA | 0.001771 | 2824.0 | 804 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 23 | N7.EpiA.CTGATGGACCCTCA-Epi | Secretory TA | Non-inflamed | Epi | N7 | Epi | 944 | 5081 | N7.EpiA.CTGATGGACCCTCA | 0.031096 | 5081.0 | 944 | N7 | 6 | transit amplifying cell | goblet cell | 6: goblet cell | Epithelial |
| 24 | N7.EpiA.CTTAAGCTTACGAC-Epi | Cycling TA | Non-inflamed | Epi | N7 | Epi | 2356 | 15497 | N7.EpiA.CTTAAGCTTACGAC | 0.066271 | 15497.0 | 2356 | N7 | 8 | proliferating transit amplifying cell | epithelial cell | 8: epithelial cell | Epithelial |
| 25 | N7.EpiA.GAGCTCCTACCACA-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 1240 | 5349 | N7.EpiA.GAGCTCCTACCACA | 0.005235 | 5349.0 | 1240 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 26 | N7.EpiA.GAGCTCCTGCCTTC-Epi | Cycling TA | Non-inflamed | Epi | N7 | Epi | 3145 | 25630 | N7.EpiA.GAGCTCCTGCCTTC | 0.112563 | 25630.0 | 3145 | N7 | 6 | proliferating transit amplifying cell | goblet cell | 6: goblet cell | Epithelial |
| 27 | N7.EpiA.GATGCAACACGCAT-Epi | TA 2 | Non-inflamed | Epi | N7 | Epi | 981 | 4075 | N7.EpiA.GATGCAACACGCAT | 0.006871 | 4075.0 | 981 | N7 | 0 | transit amplifying cell | epithelial cell | 0: epithelial cell | Epithelial |
| 28 | N7.EpiA.GCGAGAGATCGTTT-Epi | Cycling TA | Non-inflamed | Epi | N7 | Epi | 1422 | 6884 | N7.EpiA.GCGAGAGATCGTTT | 0.107641 | 6884.0 | 1422 | N7 | 8 | proliferating transit amplifying cell | epithelial cell | 8: epithelial cell | Epithelial |
| 29 | N7.EpiA.GGAACTTGTGGCAT-Epi | Enterocytes | Non-inflamed | Epi | N7 | Epi | 924 | 5051 | N7.EpiA.GGAACTTGTGGCAT | 0.072461 | 5051.0 | 924 | N7 | 7 | enterocyte | enterocyte | 7: enterocyte | Epithelial |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 149702 | N110.LPB.TTGAACGGTGCAACGA-Imm | Plasma | Inflamed | LP | N110 | Imm | 1246 | 8584 | N110.LPB.TTGAACGGTGCAACGA | 0.026445 | 8584.0 | 1246 | N110 | 1 | plasma cell | plasma cell | 1: plasma cell | B_cells |
| 149703 | N110.LPB.TTGAACGTCACGCATA-Imm | Plasma | Inflamed | LP | N110 | Imm | 1198 | 7094 | N110.LPB.TTGAACGTCACGCATA | 0.020863 | 7094.0 | 1198 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149704 | N110.LPB.TTGAACGTCCACTGGG-Imm | Plasma | Inflamed | LP | N110 | Imm | 1179 | 7678 | N110.LPB.TTGAACGTCCACTGGG | 0.011071 | 7678.0 | 1179 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149705 | N110.LPB.TTGAACGTCGTACGGC-Imm | CD8+ LP | Inflamed | LP | N110 | Imm | 860 | 2072 | N110.LPB.TTGAACGTCGTACGGC | 0.057432 | 2072.0 | 860 | N110 | 3 | CD8-positive, alpha-beta T cell | T cell or ILC | 3: T cell or ILC | T_cells |
| 149706 | N110.LPB.TTGACTTCATCCTAGA-Imm | Plasma | Inflamed | LP | N110 | Imm | 1420 | 9090 | N110.LPB.TTGACTTCATCCTAGA | 0.016832 | 9090.0 | 1420 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149707 | N110.LPB.TTGACTTTCGTACGGC-Imm | Plasma | Inflamed | LP | N110 | Imm | 1588 | 11496 | N110.LPB.TTGACTTTCGTACGGC | 0.012961 | 11496.0 | 1588 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149708 | N110.LPB.TTGCCGTAGAGCAATT-Imm | CD4+ Activated Fos-lo | Inflamed | LP | N110 | Imm | 814 | 2466 | N110.LPB.TTGCCGTAGAGCAATT | 0.027981 | 2466.0 | 814 | N110 | 3 | activated CD4-positive, alpha-beta T cell | T cell or ILC | 3: T cell or ILC | T_cells |
| 149709 | N110.LPB.TTGCCGTAGTCAAGCG-Imm | Plasma | Inflamed | LP | N110 | Imm | 1223 | 6920 | N110.LPB.TTGCCGTAGTCAAGCG | 0.016763 | 6920.0 | 1223 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149710 | N110.LPB.TTGCCGTGTCAGATAA-Imm | Plasma | Inflamed | LP | N110 | Imm | 1672 | 13511 | N110.LPB.TTGCCGTGTCAGATAA | 0.014063 | 13511.0 | 1672 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149711 | N110.LPB.TTGCCGTGTGTGTGCC-Imm | Plasma | Inflamed | LP | N110 | Imm | 1499 | 11399 | N110.LPB.TTGCCGTGTGTGTGCC | 0.011405 | 11399.0 | 1499 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149712 | N110.LPB.TTGCCGTTCAAAGACA-Imm | Plasma | Inflamed | LP | N110 | Imm | 1036 | 5960 | N110.LPB.TTGCCGTTCAAAGACA | 0.021980 | 5960.0 | 1036 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149713 | N110.LPB.TTGCGTCAGCTCCTTC-Imm | Follicular | Inflamed | LP | N110 | Imm | 853 | 2033 | N110.LPB.TTGCGTCAGCTCCTTC | 0.052632 | 2033.0 | 853 | N110 | 10 | follicular B cell | B cell | 10: B cell | B_cells |
| 149714 | N110.LPB.TTGGAACCATGCCTTC-Imm | Plasma | Inflamed | LP | N110 | Imm | 895 | 5828 | N110.LPB.TTGGAACCATGCCTTC | 0.012011 | 5828.0 | 895 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149715 | N110.LPB.TTGGAACTCTCTGTCG-Imm | Plasma | Inflamed | LP | N110 | Imm | 1313 | 8122 | N110.LPB.TTGGAACTCTCTGTCG | 0.019576 | 8122.0 | 1313 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149716 | N110.LPB.TTGTAGGTCCTTGACC-Imm | Plasma | Inflamed | LP | N110 | Imm | 837 | 4026 | N110.LPB.TTGTAGGTCCTTGACC | 0.034526 | 4026.0 | 837 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149717 | N110.LPB.TTTACTGAGAACAACT-Imm | Plasma | Inflamed | LP | N110 | Imm | 1102 | 7834 | N110.LPB.TTTACTGAGAACAACT | 0.013786 | 7834.0 | 1102 | N110 | 1 | plasma cell | plasma cell | 1: plasma cell | B_cells |
| 149718 | N110.LPB.TTTACTGTCTACGAGT-Imm | Plasma | Inflamed | LP | N110 | Imm | 1432 | 9233 | N110.LPB.TTTACTGTCTACGAGT | 0.014946 | 9233.0 | 1432 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149719 | N110.LPB.TTTATGCAGACGCACA-Imm | Plasma | Inflamed | LP | N110 | Imm | 1343 | 9698 | N110.LPB.TTTATGCAGACGCACA | 0.017014 | 9698.0 | 1343 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149720 | N110.LPB.TTTATGCAGTACTTGC-Imm | Cycling Monocytes | Inflamed | LP | N110 | Imm | 2101 | 6771 | N110.LPB.TTTATGCAGTACTTGC | 0.058189 | 6771.0 | 2101 | N110 | 12 | proliferating monocyte | macrophage | 12: macrophage | Myeloid |
| 149721 | N110.LPB.TTTATGCGTTTACTCT-Imm | Plasma | Inflamed | LP | N110 | Imm | 1166 | 7139 | N110.LPB.TTTATGCGTTTACTCT | 0.021151 | 7139.0 | 1166 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149722 | N110.LPB.TTTATGCTCCGCGTTT-Imm | Macrophages | Inflamed | LP | N110 | Imm | 1545 | 5224 | N110.LPB.TTTATGCTCCGCGTTT | 0.043453 | 5224.0 | 1545 | N110 | 12 | macrophage | macrophage | 12: macrophage | Myeloid |
| 149723 | N110.LPB.TTTATGCTCTTGGGTA-Imm | Plasma | Inflamed | LP | N110 | Imm | 959 | 5389 | N110.LPB.TTTATGCTCTTGGGTA | 0.017257 | 5389.0 | 959 | N110 | 4 | plasma cell | plasma cell | 4: plasma cell | B_cells |
| 149724 | N110.LPB.TTTCCTCAGCGCTCCA-Imm | Plasma | Inflamed | LP | N110 | Imm | 947 | 5470 | N110.LPB.TTTCCTCAGCGCTCCA | 0.024314 | 5470.0 | 947 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149725 | N110.LPB.TTTCCTCAGTACGACG-Imm | Plasma | Inflamed | LP | N110 | Imm | 1164 | 6099 | N110.LPB.TTTCCTCAGTACGACG | 0.020659 | 6099.0 | 1164 | N110 | 4 | plasma cell | plasma cell | 4: plasma cell | B_cells |
| 149726 | N110.LPB.TTTCCTCTCAAAGACA-Imm | Plasma | Inflamed | LP | N110 | Imm | 911 | 4672 | N110.LPB.TTTCCTCTCAAAGACA | 0.034461 | 4672.0 | 911 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149727 | N110.LPB.TTTGCGCAGGGTTCCC-Imm | Macrophages | Inflamed | LP | N110 | Imm | 1172 | 3549 | N110.LPB.TTTGCGCAGGGTTCCC | 0.063398 | 3549.0 | 1172 | N110 | 12 | macrophage | macrophage | 12: macrophage | Myeloid |
| 149728 | N110.LPB.TTTGCGCCATGTCGAT-Imm | Plasma | Inflamed | LP | N110 | Imm | 1356 | 9534 | N110.LPB.TTTGCGCCATGTCGAT | 0.019194 | 9534.0 | 1356 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149729 | N110.LPB.TTTGCGCTCAACGAAA-Imm | CD4+ Activated Fos-hi | Inflamed | LP | N110 | Imm | 858 | 2084 | N110.LPB.TTTGCGCTCAACGAAA | 0.051344 | 2084.0 | 858 | N110 | 3 | activated CD4-positive, alpha-beta T cell | T cell or ILC | 3: T cell or ILC | T_cells |
| 149730 | N110.LPB.TTTGCGCTCAACGGCC-Imm | Plasma | Inflamed | LP | N110 | Imm | 1753 | 12962 | N110.LPB.TTTGCGCTCAACGGCC | 0.019981 | 12962.0 | 1753 | N110 | 2 | plasma cell | plasma cell | 2: plasma cell | B_cells |
| 149731 | N110.LPB.TTTGTCAGTTGACGTT-Imm | Macrophages | Inflamed | LP | N110 | Imm | 965 | 2696 | N110.LPB.TTTGTCAGTTGACGTT | 0.051187 | 2696.0 | 965 | N110 | 12 | macrophage | macrophage | 12: macrophage | Myeloid |
149732 rows × 18 columns
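Since the classifier is trained on the `Type` column, it is worth looking at how the labels are distributed, because strongly under-represented classes may be harder to predict. The sketch below uses only the standard library and a toy label list (the label names are taken from the table above, but the counts are invented for illustration); `label_fractions` is a hypothetical helper. On the real data, the same information is available from pandas via `adata_trains[0].obs[celltype].value_counts()`.

```python
from collections import Counter


def label_fractions(labels):
    """Return the fraction of observations per label (hypothetical helper)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Toy labels for illustration only; these are not the real class frequencies.
toy_labels = [
    "Epithelial", "Epithelial", "B_cells", "T_cells",
    "Myeloid", "B_cells", "B_cells", "B_cells",
]
fractions = label_fractions(toy_labels)
for label, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {frac:.2f}")
```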
adata_pred.obs
| index | CELL | CONDITION | Sample_geo_accession | Sample_title | Subject | tissue | status | 10x chemistry | Sample_relation | Sample_relation_2 | ... | Lane | Subtype | percent_mito | n_counts | n_genes | batch | leiden | dblabel | celltype | cluster_celltype |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GSM3972009_69.AAACATACACACCA-1 | GSM3972009_69.AAACATACACACCA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Central Memory T cells | 0.012165 | 1233.0 | 500 | pat. 5 | 0 | CD4-positive, alpha-beta memory T cell | T cell | 0: T cell |
| GSM3972009_69.AAACATTGGTGTCA-1 | GSM3972009_69.AAACATTGGTGTCA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | unknown | unknown | 0.012548 | 4142.0 | 1277 | pat. 5 | 12 | fibroblast | fibroblast | 12: fibroblast |
| GSM3972009_69.AAACGCACTTAGGC-1 | GSM3972009_69.AAACGCACTTAGGC-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Activated fibroblasts | 0.006716 | 5806.0 | 1727 | pat. 5 | 12 | fibroblast | fibroblast | 12: fibroblast |
| GSM3972009_69.AAACGCTGCTACCC-1 | GSM3972009_69.AAACGCTGCTACCC-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Tregs | 0.010526 | 1327.0 | 627 | pat. 5 | 5 | regulatory T cell | T cell | 5: T cell |
| GSM3972009_69.AAACTTGAGTCACA-1 | GSM3972009_69.AAACTTGAGTCACA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Pericytes | 0.013407 | 3803.0 | 1327 | pat. 5 | 21 | pericyte cell | pericyte cell | 21: pericyte cell |
| GSM3972009_69.AAACTTGATCACCC-1 | GSM3972009_69.AAACTTGATCACCC-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgA plasma cells | 0.006319 | 6330.0 | 864 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AAAGACGAATCACG-1 | GSM3972009_69.AAAGACGAATCACG-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Plasmablasts | 0.015349 | 10294.0 | 2204 | pat. 5 | 19 | proliferating T cell | T cell | 19: T cell |
| GSM3972009_69.AAAGACGAATGCTG-1 | GSM3972009_69.AAAGACGAATGCTG-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Resident macrophages | 0.027449 | 1858.0 | 644 | pat. 5 | 6 | myeloid leukocyte | myeloid leukocyte | 6: myeloid leukocyte |
| GSM3972009_69.AAAGATCTGTATCG-1 | GSM3972009_69.AAAGATCTGTATCG-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | DC2 | 0.017936 | 2732.0 | 870 | pat. 5 | 6 | myeloid leukocyte | myeloid leukocyte | 6: myeloid leukocyte |
| GSM3972009_69.AAAGATCTTATCGG-1 | GSM3972009_69.AAAGATCTTATCGG-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgG plasma cells | 0.004392 | 15480.0 | 920 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AAAGCAGAGGTTCA-1 | GSM3972009_69.AAAGCAGAGGTTCA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | DC1 | 0.016954 | 3657.0 | 875 | pat. 5 | 6 | myeloid leukocyte | myeloid leukocyte | 6: myeloid leukocyte |
| GSM3972009_69.AAAGCAGATTGGCA-1 | GSM3972009_69.AAAGCAGATTGGCA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Plasmablasts | 0.019023 | 5467.0 | 1233 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AAAGCAGATTTCGT-1 | GSM3972009_69.AAAGCAGATTTCGT-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | ACKR1+ endothelial cells | 0.015983 | 1438.0 | 692 | pat. 5 | 14 | blood vessel endothelial cell | endothelial cell | 14: endothelial cell |
| GSM3972009_69.AAAGCCTGACTGTG-1 | GSM3972009_69.AAAGCCTGACTGTG-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Activated fibroblasts | 0.087591 | 1643.0 | 816 | pat. 5 | 12 | fibroblast | fibroblast | 12: fibroblast |
| GSM3972009_69.AAAGCCTGAGCCTA-1 | GSM3972009_69.AAAGCCTGAGCCTA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Central Memory T cells | 0.017105 | 1520.0 | 595 | pat. 5 | 3 | naive T cell | T cell | 3: T cell |
| GSM3972009_69.AAAGTTTGTCTAGG-1 | GSM3972009_69.AAAGTTTGTCTAGG-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Activated fibroblasts | 0.013503 | 7848.0 | 1974 | pat. 5 | 12 | fibroblast | fibroblast | 12: fibroblast |
| GSM3972009_69.AAATCAACTTCTTG-1 | GSM3972009_69.AAATCAACTTCTTG-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgG plasma cells | 0.004730 | 6342.0 | 847 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AAATCATGAATGCC-1 | GSM3972009_69.AAATCATGAATGCC-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgG plasma cells | 0.002807 | 16389.0 | 1288 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AAATCCCTGTCGTA-1 | GSM3972009_69.AAATCCCTGTCGTA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgG plasma cells | 0.006611 | 8923.0 | 930 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AAATCTGAGCTAAC-1 | GSM3972009_69.AAATCTGAGCTAAC-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgG plasma cells | 0.015385 | 12282.0 | 1639 | pat. 5 | 6 | myeloid leukocyte | myeloid leukocyte | 6: myeloid leukocyte |
| GSM3972009_69.AAATGGGAGGACAG-1 | GSM3972009_69.AAATGGGAGGACAG-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgA plasma cells | 0.007100 | 7746.0 | 971 | pat. 5 | 10 | IgM or IgA plasma cell | plasma cell | 10: plasma cell |
| GSM3972009_69.AAATGTTGGTGCTA-1 | GSM3972009_69.AAATGTTGGTGCTA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Activated fibroblasts | 0.027858 | 4519.0 | 1331 | pat. 5 | 12 | fibroblast | fibroblast | 12: fibroblast |
| GSM3972009_69.AAATTCGACAACCA-1 | GSM3972009_69.AAATTCGACAACCA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgG plasma cells | 0.015620 | 3201.0 | 745 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AACAAACTGGTCAT-1 | GSM3972009_69.AACAAACTGGTCAT-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | DC1 | 0.033011 | 1757.0 | 617 | pat. 5 | 6 | myeloid leukocyte | myeloid leukocyte | 6: myeloid leukocyte |
| GSM3972009_69.AACACGTGAAGATG-1 | GSM3972009_69.AACACGTGAAGATG-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgG plasma cells | 0.002581 | 13167.0 | 880 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AACACGTGTCGCTC-1 | GSM3972009_69.AACACGTGTCGCTC-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgG plasma cells | 0.005225 | 8230.0 | 710 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AACACTCTAGCAAA-1 | GSM3972009_69.AACACTCTAGCAAA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Activated fibroblasts | 0.013588 | 4340.0 | 1368 | pat. 5 | 12 | fibroblast | fibroblast | 12: fibroblast |
| GSM3972009_69.AACAGAGAATTTCC-1 | GSM3972009_69.AACAGAGAATTTCC-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgG plasma cells | 0.006489 | 3081.0 | 525 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AACAGCACTCAGGT-1 | GSM3972009_69.AACAGCACTCAGGT-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | IgG plasma cells | 0.004187 | 10508.0 | 887 | pat. 5 | 11 | IgG plasma cell | plasma cell | 11: plasma cell |
| GSM3972009_69.AACAGCACTGCACA-1 | GSM3972009_69.AACAGCACTGCACA-1 | Involved | GSM3972009 | Ileal Involved 69 | pat. 5 | ileal | Involved | V1 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 69.0 | Central Memory T cells | 0.015778 | 1331.0 | 533 | pat. 5 | 5 | regulatory T cell | T cell | 5: T cell |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| GSM3972030_209.TTTATGCCATTTGCCC-1 | GSM3972030_209.TTTATGCCATTTGCCC-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Th17 Trm | 0.025601 | 3164.0 | 946 | pat. 16 | 7 | CD8-positive, alpha-beta memory T cell | T cell | 7: T cell |
| GSM3972030_209.TTTATGCGTGCATCTA-1 | GSM3972030_209.TTTATGCGTGCATCTA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Tregs | 0.020492 | 2928.0 | 901 | pat. 16 | 5 | regulatory T cell | T cell | 5: T cell |
| GSM3972030_209.TTTCCTCAGGGCACTA-1 | GSM3972030_209.TTTCCTCAGGGCACTA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | CD8 Trm | 0.067949 | 1663.0 | 522 | pat. 16 | 18 | memory T cell | T cell | 18: T cell |
| GSM3972030_209.TTTCCTCAGTGGCACA-1 | GSM3972030_209.TTTCCTCAGTGGCACA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Tregs | 0.074179 | 1766.0 | 625 | pat. 16 | 5 | regulatory T cell | T cell | 5: T cell |
| GSM3972030_209.TTTCCTCCAGCTCGCA-1 | GSM3972030_209.TTTCCTCCAGCTCGCA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Tregs | 0.050861 | 2556.0 | 884 | pat. 16 | 5 | regulatory T cell | T cell | 5: T cell |
| GSM3972030_209.TTTCCTCCATTCACTT-1 | GSM3972030_209.TTTCCTCCATTCACTT-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | CD8 Trm | 0.075423 | 2307.0 | 813 | pat. 16 | 1 | CD8-positive, alpha-beta cytotoxic T cell | T cell | 1: T cell |
| GSM3972030_209.TTTCCTCGTAGATTAG-1 | GSM3972030_209.TTTCCTCGTAGATTAG-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Resident macrophages | 0.048736 | 3837.0 | 1408 | pat. 16 | 6 | myeloid leukocyte | myeloid leukocyte | 6: myeloid leukocyte |
| GSM3972030_209.TTTCCTCGTCGCGGTT-1 | GSM3972030_209.TTTCCTCGTCGCGGTT-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | CD8 Trm | 0.026460 | 3477.0 | 1004 | pat. 16 | 1 | CD8-positive, alpha-beta cytotoxic T cell | T cell | 1: T cell |
| GSM3972030_209.TTTCCTCTCCTTCAAT-1 | GSM3972030_209.TTTCCTCTCCTTCAAT-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | IgA plasma cells | 0.013345 | 10116.0 | 1344 | pat. 16 | 4 | IgM or IgA plasma cell | plasma cell | 4: plasma cell |
| GSM3972030_209.TTTCCTCTCGAACTGT-1 | GSM3972030_209.TTTCCTCTCGAACTGT-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | CD8 Trm | 0.077309 | 1397.0 | 545 | pat. 16 | 1 | CD8-positive, alpha-beta cytotoxic T cell | T cell | 1: T cell |
| GSM3972030_209.TTTGCGCAGACGCACA-1 | GSM3972030_209.TTTGCGCAGACGCACA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Fibroblasts | 0.025401 | 3301.0 | 1504 | pat. 16 | 12 | fibroblast | fibroblast | 12: fibroblast |
| GSM3972030_209.TTTGCGCAGGTAGCCA-1 | GSM3972030_209.TTTGCGCAGGTAGCCA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Memory B cells | 0.052716 | 8744.0 | 2026 | pat. 16 | 2 | memory B cell | B cell | 2: B cell |
| GSM3972030_209.TTTGCGCCAAGTAATG-1 | GSM3972030_209.TTTGCGCCAAGTAATG-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Memory B cells | 0.052926 | 3760.0 | 920 | pat. 16 | 2 | memory B cell | B cell | 2: B cell |
| GSM3972030_209.TTTGCGCCACGGACAA-1 | GSM3972030_209.TTTGCGCCACGGACAA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | CD8 Trm | 0.034401 | 1686.0 | 605 | pat. 16 | 1 | CD8-positive, alpha-beta cytotoxic T cell | T cell | 1: T cell |
| GSM3972030_209.TTTGCGCGTCAGAATA-1 | GSM3972030_209.TTTGCGCGTCAGAATA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Lymphatics | 0.126883 | 2185.0 | 1064 | pat. 16 | 23 | HEV endothelial cell | endothelial cell | 23: endothelial cell |
| GSM3972030_209.TTTGCGCTCAGTCCCT-1 | GSM3972030_209.TTTGCGCTCAGTCCCT-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | IgA plasma cells | 0.020774 | 17618.0 | 1489 | pat. 16 | 13 | IgM or IgA plasma cell | plasma cell | 13: plasma cell |
| GSM3972030_209.TTTGCGCTCCAGTAGT-1 | GSM3972030_209.TTTGCGCTCCAGTAGT-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Tregs | 0.043265 | 2034.0 | 708 | pat. 16 | 5 | regulatory T cell | T cell | 5: T cell |
| GSM3972030_209.TTTGGTTCAAAGCGGT-1 | GSM3972030_209.TTTGGTTCAAAGCGGT-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | CD8 Trm | 0.067089 | 2370.0 | 878 | pat. 16 | 1 | CD8-positive, alpha-beta cytotoxic T cell | T cell | 1: T cell |
| GSM3972030_209.TTTGGTTCAATGTTGC-1 | GSM3972030_209.TTTGGTTCAATGTTGC-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | DC1 | 0.079333 | 8395.0 | 1970 | pat. 16 | 6 | myeloid leukocyte | myeloid leukocyte | 6: myeloid leukocyte |
| GSM3972030_209.TTTGGTTCATTCCTGC-1 | GSM3972030_209.TTTGGTTCATTCCTGC-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Memory B cells | 0.056118 | 4150.0 | 1165 | pat. 16 | 2 | memory B cell | B cell | 2: B cell |
| GSM3972030_209.TTTGGTTGTGTAATGA-1 | GSM3972030_209.TTTGGTTGTGTAATGA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Th17 Trm | 0.031838 | 5370.0 | 1679 | pat. 16 | 0 | CD4-positive, alpha-beta memory T cell | T cell | 0: T cell |
| GSM3972030_209.TTTGGTTTCAAACGGG-1 | GSM3972030_209.TTTGGTTTCAAACGGG-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | IgA plasma cells | 0.060678 | 7136.0 | 1289 | pat. 16 | 13 | IgM or IgA plasma cell | plasma cell | 13: plasma cell |
| GSM3972030_209.TTTGGTTTCCTGCAGG-1 | GSM3972030_209.TTTGGTTTCCTGCAGG-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Th17 Trm | 0.000573 | 3493.0 | 1213 | pat. 16 | 0 | CD4-positive, alpha-beta memory T cell | T cell | 0: T cell |
| GSM3972030_209.TTTGTCAAGCATCATC-1 | GSM3972030_209.TTTGTCAAGCATCATC-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | CD8 Trm | 0.121032 | 1512.0 | 533 | pat. 16 | 18 | memory T cell | T cell | 18: T cell |
| GSM3972030_209.TTTGTCAAGGTGATTA-1 | GSM3972030_209.TTTGTCAAGGTGATTA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | IgA plasma cells | 0.024231 | 11802.0 | 1626 | pat. 16 | 13 | IgM or IgA plasma cell | plasma cell | 13: plasma cell |
| GSM3972030_209.TTTGTCAGTCTCCCTA-1 | GSM3972030_209.TTTGTCAGTCTCCCTA-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | IgA plasma cells | 0.048219 | 4998.0 | 795 | pat. 16 | 10 | IgM or IgA plasma cell | plasma cell | 10: plasma cell |
| GSM3972030_209.TTTGTCAGTCTCTTAT-1 | GSM3972030_209.TTTGTCAGTCTCTTAT-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | CD8 Trm | 0.031284 | 1822.0 | 749 | pat. 16 | 1 | CD8-positive, alpha-beta cytotoxic T cell | T cell | 1: T cell |
| GSM3972030_209.TTTGTCAGTGTGGTTT-1 | GSM3972030_209.TTTGTCAGTGTGGTTT-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | ILC3 | 0.100821 | 2190.0 | 740 | pat. 16 | 16 | group 3 innate lymphoid cell | ILC3 | 16: ILC3 |
| GSM3972030_209.TTTGTCATCAGTTAGC-1 | GSM3972030_209.TTTGTCATCAGTTAGC-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | Trm | 0.040230 | 1740.0 | 611 | pat. 16 | 0 | CD4-positive, alpha-beta memory T cell | T cell | 0: T cell |
| GSM3972030_209.TTTGTCATCGTCCAGG-1 | GSM3972030_209.TTTGTCATCGTCCAGG-1 | Involved | GSM3972030 | Ileal Involved 209 | pat. 16 | ileal | Involved | V2 | BioSample: https://www.ncbi.nlm.nih.gov/biosam... | SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX... | ... | 209.0 | CD8 Trm | 0.030513 | 2884.0 | 976 | pat. 16 | 1 | CD8-positive, alpha-beta cytotoxic T cell | T cell | 1: T cell |
62202 rows × 27 columns
This function merges the training datasets, removes unwanted genes, and, if scanorama is used, corrects for batch effects between datasets.
adata_train, adata_pred = bc.tl.auto_annot.merge_data(adata_trains, adata_pred, genes_to_use = genes_to_use, merge = merge)
merging with scanorama
using scanorama rn
Found 1054 genes among all datasets
[[0. 0.55768303]
 [0. 0. ]]
Processing datasets (0, 1)
integrating training set
calculating intersection
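The "Found 1054 genes among all datasets" line reflects that only genes shared by the training and test data are kept. As a minimal sketch of that idea (not besca's or scanorama's actual implementation), the helper `shared_genes` and the toy tables below are hypothetical and only illustrate restricting several datasets to their common genes:

```python
import pandas as pd


def shared_genes(*gene_lists):
    """Return the genes present in every list, in the order of the first list."""
    common = set(gene_lists[0]).intersection(*map(set, gene_lists[1:]))
    return [g for g in gene_lists[0] if g in common]


# Hypothetical toy expression tables (cells x genes), not the tutorial data.
train = pd.DataFrame([[1, 0, 3], [2, 5, 0]], columns=["CD3E", "MS4A1", "EPCAM"])
test = pd.DataFrame([[4, 1], [0, 2]], columns=["EPCAM", "CD3E"])

genes = shared_genes(train.columns, test.columns)

# Subset both tables to the shared genes, in the same column order.
train_shared = train[genes]
test_shared = test[genes]
print(genes)
```

Keeping the column order identical in both tables matters because a classifier fitted on the training matrix expects the test matrix to present the same features in the same positions.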
The function returns the fitted classifier together with a scaler that is fitted on the training dataset (centering to zero mean and scaling to unit variance).
classifier, scaler = bc.tl.auto_annot.fit(adata_train, method, celltype)
[Parallel(n_jobs=10)]: Using backend LokyBackend with 10 concurrent workers.
[Parallel(n_jobs=10)]: Done 5 out of 5 | elapsed: 11.9min finished
./conda/envs/besca_test/lib/python3.6/site-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning:
lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
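The ConvergenceWarning above suggests either scaling the data or raising `max_iter`. The following is a small sketch of both measures using scikit-learn directly on hypothetical toy data; it does not reproduce what `bc.tl.auto_annot.fit` does internally, and the names `toy_scaler` and `toy_clf` are illustrative only:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 cells x 5 genes, two well-separated classes.
X = np.vstack([rng.normal(0.0, 1.0, (100, 5)), rng.normal(3.0, 1.0, (100, 5))])
y = np.array(["A"] * 100 + ["B"] * 100)

# Fit the scaler on the training data only: zero mean, unit variance per feature.
toy_scaler = StandardScaler().fit(X)
X_scaled = toy_scaler.transform(X)

# A larger max_iter gives the lbfgs solver more iterations to converge.
toy_clf = LogisticRegression(max_iter=1000).fit(X_scaled, y)
print(toy_clf.score(X_scaled, y))
```

The same fitted scaler would then be applied to the data to be predicted, so that training and test features are on the same scale.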
Use the fitted model to predict cell types in adata_pred. The prediction is added in a new column called 'auto_annot'. adata_orig is passed as well because the returned object reverts adata_pred to its original state (all genes, no additional corrections). For SVM, set the threshold to 0 or leave it out; for logistic regression, a threshold can be set.
adata_predicted = bc.tl.auto_annot.adata_predict(classifier = classifier, scaler = scaler, adata_pred = adata_pred, adata_orig = adata_orig, threshold = 0.1)
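One common way to use such a threshold with a probabilistic classifier is to keep a prediction only when its highest class probability reaches the threshold. The sketch below illustrates that idea under stated assumptions: the function `predict_with_threshold` and the fallback label `"unknown"` are hypothetical and are not taken from besca's source.

```python
import numpy as np


def predict_with_threshold(proba, classes, threshold=0.1, unknown="unknown"):
    """Pick the most probable class per row; use `unknown` below the threshold."""
    proba = np.asarray(proba)
    best = proba.argmax(axis=1)
    # Fancy indexing returns a copy, so the class list itself is not modified.
    labels = np.asarray(classes, dtype=object)[best]
    labels[proba.max(axis=1) < threshold] = unknown
    return labels


# Hypothetical class probabilities for two cells and three classes.
classes = ["T cell", "B cell", "plasma cell"]
proba = [[0.90, 0.05, 0.05],
         [0.40, 0.35, 0.25]]

print(predict_with_threshold(proba, classes, threshold=0.5))
```

With a threshold of 0 every cell keeps its most probable label, which matches the advice to use 0 for classifiers that do not provide probabilities.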
Write the metrics to a report file and create confusion matrices and comparative UMAP plots.
%matplotlib inline
bc.tl.auto_annot.report(adata_predicted, celltype, method, analysis_name, train_datasets, test_dataset, False, merge, use_raw, genes_to_use, clustering = 'leiden')
./conda/envs/besca_test/lib/python3.6/site-packages/sklearn/metrics/_classification.py:1272: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
./conda/envs/besca_test/lib/python3.6/site-packages/sklearn/metrics/_classification.py:1272: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
... storing 'auto_annot' as categorical
WARNING: saving figure to file figures/umap.ondata_auto_annot_Martin2019_with_Smillie2019_Type.png
WARNING: saving figure to file figures/umap.auto_annot_Martin2019_with_Smillie2019_Type.png
[Figure: confusion matrix, without normalization]
[Figure: normalized confusion matrix]
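The UndefinedMetricWarning messages arise when a label has no predicted or no true samples. As an illustrative sketch using scikit-learn directly (not the internals of `bc.tl.auto_annot.report`), the toy labels below are hypothetical and show how `zero_division` and a row-normalized confusion matrix behave in that situation:

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical toy labels: "plasma cell" is never predicted.
y_true = ["T cell", "T cell", "B cell", "B cell", "plasma cell"]
y_pred = ["T cell", "T cell", "B cell", "T cell", "B cell"]
labels = ["B cell", "T cell", "plasma cell"]

# zero_division=0 sets the undefined precision to 0 without emitting a warning.
report = classification_report(
    y_true, y_pred, labels=labels, zero_division=0, output_dict=True
)

# Rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_true, y_pred, labels=labels)

# normalize="true" divides each row by the number of true samples of that label.
cm_norm = confusion_matrix(y_true, y_pred, labels=labels, normalize="true")

print(cm)
print(cm_norm)
```

Reading the normalized matrix row by row gives the fraction of each true cell type assigned to each predicted label, which is easier to compare across cell types of very different abundance.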
import scanpy as sc
sc.pl.umap(adata_predicted, color=[celltype, 'auto_annot'])
sc.pl.umap(adata_predicted, color=[celltype, 'auto_annot'], legend_loc='on data', legend_fontsize=8)
adata_train
View of AnnData object with n_obs × n_vars = 149732 × 1054
obs: 'CELL', 'Cluster', 'Health', 'Location', 'Subject', 'celltype_highlevel', 'nGene', 'nUMI', 'original_name', 'percent_mito', 'n_counts', 'n_genes', 'batch', 'leiden', 'dblabel', 'celltype', 'cluster_celltype', 'Type'
var: 'ENSEMBL-0', 'SYMBOL', 'n_cells-0', 'total_counts-0', 'frac_reads-0', 'ENSEMBL-1', 'n_cells-1', 'total_counts-1', 'frac_reads-1'